WO 00/15782 PCT/US99/20942
RIBOSOMAL FRAMESHIFT TARGETS 



Background of the Invention 

Maintenance of correct reading frame during translation of mRNA is fundamental to the integrity
of the translation process and, ultimately, to cell growth and viability. However, a number of cases have
been identified in which translating ribosomes are directed to shift reading frames, a phenomenon
referred to as "programmed ribosomal frameshifting". Most of these ribosomal frameshift events have
been observed in RNA viruses. Families of mammalian viruses in which ribosomal frameshifting has
been observed include retroviruses, coronaviruses, toroviruses, arteriviruses, astroviruses, and
paramyxoviruses. Plant viruses in which frameshifting has been observed include tetraviruses and
tombusviruses. In fungi, ribosomal frameshifting has been observed in the totiviruses and many
retrotransposable elements. Among bacteriophages, ribosomal frameshifting has been documented in T7
and λ. Viral frameshifting events typically produce fusion proteins in which the N- and C-terminal
domains are encoded by two distinct, overlapping open reading frames. Ribosomal frameshifting in
viruses determines the stoichiometric ratio of structural (Gag) to enzymatic (Gag-pol) proteins, and plays
a critical role in viral particle assembly. The study of these ribosomal frameshifts has been important
both because of their critical role in viral morphogenesis, and because of the information they provide
about the mechanisms by which reading frame is normally maintained.

The cis-acting sequences that promote efficient ribosomal frameshifting in the -1 (5') direction
have been well characterized in several viral systems, and it has been convincingly demonstrated that the
basic molecular mechanisms governing programmed -1 ribosomal frameshifting are almost identical from
yeast to humans. Two basic sequence elements are required to promote efficient levels of programmed
-1 ribosomal frameshifting. The first sequence element is a heptamer sequence, X XXY YYZ (wherein the
0-frame is indicated by spaces), called the "slippery site". The simultaneous slippage of ribosome-bound
A- and P-site tRNAs by one base in the 5' direction still leaves their non-wobble bases correctly paired
with the mRNA in the new reading frame. The second promoting element is usually a sequence that
forms a defined RNA secondary structure, such as an RNA pseudoknot, located within 8 nucleotides 3' of
the slippery site, and is thought to increase the probability that the ribosome will shift reading frame in
the -1 direction. The number of ribosomes that shift frame is affected by a number of parameters,
including the ability of the ribosome-bound tRNAs to unpair from the 0-frame, the ability of these tRNAs
to rebind to the -1 frame, the position of the RNA pseudoknot relative to the slippery site, and the
thermodynamic stability of the pseudoknot.

There are a few documented examples in which programmed ribosomal frameshifting is utilized
by mRNAs of cellular origin. In E. coli, autoregulation of a programmed +1 ribosomal frameshift in the
prfB gene is required for the synthesis of Release Factor 2 (RF2) (Craigen and Caskey, 1986; Craigen et
al., 1985; Donly et al., 1990a; Donly et al., 1990b), and a -1 ribosomal frameshift in the dnaX gene
generates the DNA polymerase gamma subunit (Flower and McHenry, 1991; Blinkowa and Walker,
1990; Tsuchihashi and Kornberg, 1990). In eukaryotic mRNAs, programmed +1 ribosomal frameshifting
has been demonstrated in genes encoding ornithine decarboxylase (ODC) Antizyme isolated from rat,
mouse, Xenopus, and Drosophila (Hayashi and Murakami, 1995; Ivanov et al., 1998; Kankare et al., 1997;
Ichiba et al., 1995; Matsufuji et al., 1995; Rom and Kahana, 1994), and in the EST3 gene of S. cerevisiae
(Lundblad and Morris, 1997). In mammalian cells, the control of ribosomal frameshifting efficiency is
autoregulated by ODC Antizyme protein levels (Craigen and Caskey, 1986; Craigen et al., 1985; Donly
et al., 1990a; Hayashi and Murakami, 1995; Matsufuji et al., 1995). In yeast cells which lack ODC
Antizyme, high concentrations of putrescine and consequently low concentrations of spermidine promote
increased efficiencies of frameshifting in the +1 direction (Balasundaram et al., 1994b; Balasundaram et
al., 1994a). Thus, the regulation of polyamine biosynthesis demonstrates how programmed ribosomal
frameshifting may be used by eukaryotic cellular genes as a post-transcriptional regulatory mechanism.

Although there are no known examples of eukaryotic cellular mRNAs which utilize programmed -1
ribosomal frameshifting, certain observations suggest that this mechanism may also be biologically
relevant for these cells. Certain yeast strains harboring chromosomal mutations which increase
the efficiency of -1 ribosomal frameshifting (mof = maintenance of frame) show cellular defects as well,
e.g. temperature-sensitive cell cycle growth arrest, temperature-sensitive mating defects, mitochondrial
defects, sensitivity to translational inhibitors, inability to degrade nonsense mRNAs, and slow growth
phenotypes (Cui et al., 1996; Dinman and Wickner, 1992; Dinman and Wickner, 1994). These
observations suggest that -1 ribosomal frameshifting may play a role in the regulation of cellular gene
expression, and that changes in the efficiency of -1 ribosomal frameshifting may affect cell growth and
replication.

Based on the hypothesis that biological systems tend to conserve and use functional molecular
regulatory mechanisms, a computer search program was designed to identify consensus -1 ribosomal
frameshift signals in large DNA databases. It was found that consensus -1 ribosomal frameshift signals
occur with frequencies significantly greater than random in these databases. It was also demonstrated
that one of the predicted -1 ribosomal frameshift signals, occurring at the 5' end of the yeast RAS1 mRNA,
promotes efficient levels of -1 ribosomal frameshifting in the yeast S. cerevisiae.

Summary of the Invention 

In accordance with the present invention, it has been discovered that gene sequences which have
the frameshifting sequences exist in many organisms other than viruses. Frameshifting sequences have
been newly identified in numerous yeast, avian, and mammalian sequences.

A computer search was designed to search for consensus -1 ribosomal frameshift signals (motif
hits) present in the EMBL virus, Saccharomyces cerevisiae, human mRNA, cDNA and Expressed
Sequence Tag (EST) databases. These searches found that potential -1 ribosomal frameshifting signals
occur at frequencies greater than one order of magnitude above random chance. This result provides
strong theoretical evidence for the existence of a subset of cellular genes which are regulated at the
translational level by -1 ribosomal frameshifting in eukaryotes, and that this post-transcriptional
regulatory mechanism is widely used by many different families of viruses as well.

The present invention provides a method of identifying a nucleic acid sequence involved in
ribosomal frameshifting. The method comprises 1) searching a database of gene sequences to identify
sequences which contain the sequence XXX YYY Z, wherein XXX represents GGG, AAA, TTT or
CCC, YYY represents AAA or TTT, Z represents A, T, or C, and wherein XXXYYYZ is not
AAAAAAA or TTTTTTT; and 2) further searching among those sequences identified in step 1 for a
sequence encoding a pseudoknot structure which is within eight nucleotides of the sequence identified in
step 1.

The present invention also provides a method of identifying a nucleic acid sequence involved in
ribosomal frameshifting, comprising the steps of selecting a gene sequence having a sequence of
nucleotides from the group of GGG, AAA, TTT and CCC; selecting said gene sequence having an
adjacent sequence of nucleotides from the group of AAA and TTT; selecting said gene sequence having
a nucleotide from the group of A, T and C, said nucleotide adjacent to said adjacent sequence of
nucleotides; excluding said gene sequence wherein said sequence of nucleotides is AAA, said adjacent
sequence of nucleotides is AAA and said nucleotide is A; excluding said gene sequence wherein said
sequence of nucleotides is TTT, said adjacent sequence of nucleotides is TTT and said nucleotide is T;
searching for an encoded pseudoknot structure which starts within eight nucleotides of said selected gene
sequence.

The present invention further provides a system for identifying a nucleic acid sequence involved
in ribosomal frameshifting, the system comprising access means for accessing a database of gene
sequences; selection means for selecting a particular gene sequence from said database of gene
sequences, said particular gene sequence having a sequence of nucleotides from the group of GGG,
AAA, TTT and CCC, an adjacent sequence of nucleotides from the group of AAA and TTT, a nucleotide
from the group of A, T and C, said nucleotide adjacent to said adjacent sequence of nucleotides, wherein
said particular gene sequence is excluded from selection when said sequence of nucleotides is AAA, said
adjacent sequence of nucleotides is AAA and said nucleotide is A and said particular gene sequence is
excluded from selection when said sequence of nucleotides is TTT, said adjacent sequence of nucleotides
is TTT and said nucleotide is T; pseudoknot search means for locating an encoded pseudoknot structure
which starts within eight nucleotides of said selected gene sequence.

The present invention also provides a method of regulating expression of a mammalian gene
comprising modulating the frequency of ribosomal frameshifting during translation of messenger RNA.



Brief Description of the Drawings 
Figure 1: Consensus programmed -1 ribosomal frameshift signal.

Figure 2: Conservation of two frameshift signals in homologous genes from different organisms. 

Detailed Description of the Invention 

The present invention provides a method of identifying a nucleic acid sequence involved in
ribosomal frameshifting. The method comprises searching a database of gene sequences to identify
nucleic acid sequences which contain a slippery site and a pseudoknot structure associated with
frameshifting. The method comprises first searching for a slippery site, which is identified by the
sequence XXX YYY Z, wherein XXX represents GGG, AAA, TTT or CCC; YYY represents AAA or
TTT; Z represents A, T, or C; and wherein XXXYYYZ is not AAAAAAA or TTTTTTT. Further
searching is conducted among those sequences containing a slippery site for a sequence encoding a
pseudoknot structure which is within eight nucleotides of the slippery site sequence.

The slippery site may have any of the following nucleic acid sequences: GGG AAA A, GGG
AAA T, GGG AAA C, AAA AAA T, AAA AAA C, TTT AAA A, TTT AAA T, TTT AAA C, CCC
AAA A, CCC AAA T, CCC AAA C, GGG TTT A, GGG TTT T, GGG TTT C, AAA TTT A, AAA
TTT T, AAA TTT C, TTT TTT A, TTT TTT C, CCC TTT A, CCC TTT T, and CCC TTT C.
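The list of permitted slippery sites follows mechanically from the XXX, YYY and Z choices and the two exclusions. The following is a minimal illustrative sketch in Python, not part of the original disclosure; the names `XXX_CHOICES`, `YYY_CHOICES`, `Z_CHOICES`, `EXCLUDED` and `enumerate_slippery_sites` are hypothetical and chosen only for this example.

```python
from itertools import product

# Building blocks of the XXX YYY Z slippery site as given in the text.
XXX_CHOICES = ("GGG", "AAA", "TTT", "CCC")
YYY_CHOICES = ("AAA", "TTT")
Z_CHOICES = ("A", "T", "C")

# The two homopolymeric heptamers that the text explicitly excludes.
EXCLUDED = {"AAAAAAA", "TTTTTTT"}


def enumerate_slippery_sites():
    """Return every allowed XXXYYYZ heptamer in a deterministic order."""
    sites = []
    for xxx, yyy, z in product(XXX_CHOICES, YYY_CHOICES, Z_CHOICES):
        heptamer = xxx + yyy + z
        if heptamer not in EXCLUDED:
            sites.append(heptamer)
    return sites


SLIPPERY_SITES = enumerate_slippery_sites()
```

The 4 x 2 x 3 = 24 combinations less the two excluded heptamers give 22 sites, which matches the number of sequences listed above.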

The present invention also provides a method of identifying a nucleic acid sequence involved in
ribosomal frameshifting, comprising the steps of selecting a gene sequence having a sequence of
nucleotides from the group of GGG, AAA, TTT and CCC; selecting said gene sequence having an
adjacent sequence of nucleotides from the group of AAA and TTT; selecting said gene sequence having
a nucleotide from the group of A, T and C, said nucleotide adjacent to said adjacent sequence of
nucleotides; excluding said gene sequence wherein said sequence of nucleotides is AAA, said adjacent
sequence of nucleotides is AAA and said nucleotide is A; excluding said gene sequence wherein said
sequence of nucleotides is TTT, said adjacent sequence of nucleotides is TTT and said nucleotide is T;
searching for an encoded pseudoknot structure which starts within eight nucleotides of said selected gene
sequence.

The present invention further provides a system for identifying a nucleic acid sequence involved
in ribosomal frameshifting, the system comprising access means for accessing a database of gene
sequences; selection means for selecting a particular gene sequence from said database of gene
sequences, said particular gene sequence having a sequence of nucleotides from the group of GGG,
AAA, TTT and CCC, an adjacent sequence of nucleotides from the group of AAA and TTT, a nucleotide
from the group of A, T and C, said nucleotide adjacent to said adjacent sequence of nucleotides, wherein
said particular gene sequence is excluded from selection when said sequence of nucleotides is AAA, said
adjacent sequence of nucleotides is AAA and said nucleotide is A and said particular gene sequence is
excluded from selection when said sequence of nucleotides is TTT, said adjacent sequence of nucleotides
is TTT and said nucleotide is T; pseudoknot search means for locating an encoded pseudoknot structure
which starts within eight nucleotides of said selected gene sequence.

Translation of any gene containing frameshift sequences, namely the slippery site and
pseudoknot sequences, is potentially regulated by the ribosomal frameshifting mechanism.
Consequently, translation of such a gene may be regulated by known methods of altering the frequency
of frameshifting, for example, by use of drugs which affect the peptidyl transferase activity.
Accordingly, the invention provides a method of regulating expression of a mammalian gene comprising
modulating the frequency of ribosomal frameshifting during translation of messenger RNA. In
accordance with the method, the frequency of frameshifting may be increased or decreased.



Computer search protocols.

The GenBank Saccharomyces cerevisiae, Homo sapiens, Mus musculus, Rattus norvegicus,
Gallus gallus, Sus scrofa, Drosophila melanogaster, and Virus divisions, and 2 x 10^4 random sequences
of 10^3 bases (G-C content = 50%) were searched using the following algorithmic structure:

Step 1: Search for XXXYYYZ (slippery site) where:

XXX = GGG, AAA, TTT or CCC

YYY = AAA or TTT

Z = A, T, or C

AND XXXYYYZ ≠ AAAAAAA or TTTTTTT.

Step 1 can be implemented by selecting a gene sequence having a sequence of nucleotides from the group
of GGG, AAA, TTT and CCC; selecting the gene sequence having an adjacent sequence of nucleotides
from the group of AAA and TTT; selecting the gene sequence having a nucleotide from the group of A,
T and C, the nucleotide adjacent to the adjacent sequence of nucleotides; excluding the gene sequence
wherein the sequence of nucleotides is AAA, the adjacent sequence of nucleotides is AAA and the
nucleotide is A; and excluding the gene sequence wherein the sequence of nucleotides is TTT, the
adjacent sequence of nucleotides is TTT and the nucleotide is T.
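Step 1 amounts to a pattern scan followed by the two exclusions. The sketch below shows one way this might be written in Python using a regular expression. It is an illustration only and is not the program used in this study; the names `SLIPPERY_SITE_RE`, `EXCLUDED` and `find_slippery_sites` are hypothetical, and the conversion of U to T for RNA-style input is an added convenience rather than something stated in the text.

```python
import re

# XXX = GGG/AAA/TTT/CCC, YYY = AAA/TTT, Z = A/T/C.
# The pattern is wrapped in a lookahead so that overlapping candidate
# sites (e.g. within a long run of A) are all reported.
SLIPPERY_SITE_RE = re.compile(r"(?=((?:GGG|AAA|TTT|CCC)(?:AAA|TTT)[ATC]))")

# Heptamers excluded by the AND clause of Step 1.
EXCLUDED = {"AAAAAAA", "TTTTTTT"}


def find_slippery_sites(sequence):
    """Return (0-based index of the first X, heptamer) for each allowed site."""
    seq = sequence.upper().replace("U", "T")  # assumption: accept RNA input too
    hits = []
    for match in SLIPPERY_SITE_RE.finditer(seq):
        heptamer = match.group(1)
        if heptamer not in EXCLUDED:
            hits.append((match.start(), heptamer))
    return hits
```

For example, `find_slippery_sites("CCGGGAAACTT")` reports the single site GGGAAAC starting at index 2, while a run of eight A residues yields no hits because both overlapping heptamers are AAAAAAA.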

Step 2: Search for a pseudoknot 3' of the XXXYYYZ slippery site motif using the GenoBase program.
Further constraints placed on the pseudoknot were:

a. The pseudoknot must begin within 8 nucleotides (NT) of base Z;

b. Stem 1 must have a minimum length of 6 base pairs, containing no more than 1
mismatch, 1 insertion or 1 deletion;

c. Gap 1 (the gap between stem 1 and stem 2) can be no greater than 3 NT in length;

d. Stem 2 must have a minimum of 5 base pairs with only 1 insertion, deletion or
mismatch allowed;

e. Gap 2 can be no greater than 3 NT in length;

f. Gap 3 is limited to 100 NT in length.

Step 3: Align motifs found in steps 1 and 2 with an open reading frame (ORF) of at least 50 codons, 
such that the first base in the slippery site (the first X) is in the third base of a codon. Further, searching 
in the 5' direction of the motif there must be an in-frame ATG codon before a translational termination 
signal (TAA, TAG, or TGA). Sequences that satisfied all of these criteria were defined as "motif hits". 
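The reading-frame test in Step 3 can be illustrated with a short Python sketch. It is not the original program. It assumes that a slippery site has already been located at a known 0-based index, and it assumes that the 50-codon minimum is measured in the 0-frame from the upstream ATG to the first in-frame stop codon (or to the end of the sequence), which the text does not spell out. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_start_codon(sequence, site_pos):
    """Return the index of the nearest in-frame upstream ATG, or None.

    site_pos is the 0-based index of the first X of the slippery site.
    The 0-frame is fixed by requiring that base to be the third base of
    a codon, so the codon containing it starts at site_pos - 2.
    """
    seq = sequence.upper()
    codon_start = site_pos - 2
    while codon_start >= 0:
        codon = seq[codon_start:codon_start + 3]
        if codon in STOP_CODONS:
            return None  # a stop codon is met before any ATG
        if codon == "ATG":
            return codon_start
        codon_start -= 3  # step one codon in the 5' direction
    return None  # reached the 5' end without finding an ATG


def orf_length_in_codons(sequence, start):
    """Count codons from `start` up to, not including, the first in-frame stop."""
    seq = sequence.upper()
    count = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


def passes_step3(sequence, site_pos, min_codons=50):
    """Apply the frame, upstream-ATG and minimum-length checks (as assumed above)."""
    start = upstream_start_codon(sequence, site_pos)
    if start is None:
        return False
    return orf_length_in_codons(sequence, start) >= min_codons
```

In the toy sequence ATG GCC GCG GGA AAC TAA, the slippery site GGGAAAC begins at index 8, the third base of the GCG codon, and walking upstream codon by codon reaches the ATG at index 0 without meeting a stop codon.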
Strains, media, genetic methods, and plasmid construction.

E. coli strain DH5 was used for plasmid preparations, and transformations of E. coli and S.
cerevisiae were performed as described (Dinman and Wickner, 1992). YPAD and synthetic complete
medium were prepared as described (Dinman and Wickner, 1994). The S. cerevisiae strain JD88 (MATa
ura3-52 lys2-801 ade2-10 trp1 [L-AHNB] [M1]) was used for in vivo measurements of -1 ribosomal
frameshifting efficiencies as described in (Dinman and Wickner, 1992).

pJD160.0 is based on p314-JD86-ter (Cui et al., 1996), with the modification that it contains
unique Bam HI, Sma I and Kpn I restriction endonuclease recognition sites 3' of the AUG start codon,
and 5' of the lacZ gene. This is the 0-frame control plasmid. pJD160.-1 is identical to pJD160.0 except
that lacZ is in the -1 frame with respect to the translational start site without any intervening frameshift
signal. This is used to measure unprogrammed -1 ribosomal frameshifting. The frameshift signal from
the yeast RAS1 gene was amplified from genomic DNA by polymerase chain reaction (PCR) as described
(Costa and Weiner, 1995) using the synthetic oligonucleotide primers shown in Table 1.

Table 1. Oligonucleotide primers used in this study

Oligonucleotide Primer                                        Description

5' AAAGAATTCCGACATGCAGGGAAATCCAAATCAAC 3' (SEQ ID NO:1)       RAS1 5' Eco RI
5' CCCCGGTACCGTCATCGATGACAACTT 3' (SEQ ID NO:2)               RAS1 3' Kpn I

Italicized bases denote added restriction endonuclease recognition sites. Bold bases indicate
gene sequence. Underlined bases were added 3' of the slippery site and 5' of the predicted mRNA
pseudoknot forming region so that a -1 ribosomal frameshift will direct elongating ribosomes into the
original reading frame.

Since the RAS1 frameshift signal is predicted to direct ribosomes into premature termination
signals, two additional nucleotides were added in the spacer regions between the slippery sites and
pseudoknots of these PCR products such that a -1 frameshift would re-direct ribosomes into the original
reading frame. The PCR products were cloned into pJD160.0 to produce pJD160.RAS1. In this
construct, a programmed -1 frameshift is required in order for the lacZ gene to be translated.

RESULTS

The program is capable of finding known viral programmed -1 ribosomal frameshift signals.

As a positive control, the program was used to search all 36,556 loci of the GenBank virus
division, revealing 1077 motif hits. The program identified almost all of the known viral -1 ribosomal
frameshift signals, including those that have been classically used to study programmed -1 ribosomal
frameshifting. These include Mouse Mammary Tumor Virus, Feline Leukemia Virus, and Infectious
Bronchitis Virus. As expected, the program was not able to identify the motif hit in Rous Sarcoma Virus
because Gaps 1 and 2 represented in Figure 1 are larger than allowed by the program. In addition,
many motif hits were identified in families of viruses where -1 ribosomal frameshifting has not been
described. For example, a frameshift motif appears to be well conserved in the E1B protein large
T-antigen mRNA among the adenoviruses, and in the VP16 family of proteins in many of the herpesviruses.

Consensus motif hits occur at frequencies significantly greater than random in the genome databases.

If a subset of cellular genes utilize programmed -1 ribosomal frameshifting, then it may be
assumed that the consensus frameshift motifs should be present in the genomes of many different species
at frequencies significantly greater than random. To test this, the probability of the random occurrence of
a motif hit was determined. The program was run twice against 10^4 randomly generated sequences of
10^3 bases. For technical reasons, the G:C content was set to 50%. This negative control found 41 motif
hits in the first run and 42 in the second. Thus, the random frequency of motif hits is 83 per 2 x 10^7
bases. Searches of the large DNA databases revealed that motif hits occur with frequencies significantly
greater than random (Table 2).
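The "Fold > Random" values reported in Table 2 are the ratio of the hit density in a database to the hit density in the random control. A minimal Python sketch of that arithmetic follows; the constant and function names are hypothetical, while the numbers are taken from the text (41 + 42 random hits in 2 x 10^7 bases) and from the yeast, human and virus rows of Table 2.

```python
# Negative control reported in the text: two runs of 10^4 sequences x 10^3 bases.
RANDOM_HITS = 41 + 42
RANDOM_BASES = 2 * 10**4 * 10**3  # 2 x 10^7 bases in total


def hits_per_base(motif_hits, bases_searched):
    """Density of motif hits in a searched sequence set."""
    return motif_hits / bases_searched


def fold_over_random(motif_hits, bases_searched):
    """Hit density relative to the random-sequence control."""
    random_density = hits_per_base(RANDOM_HITS, RANDOM_BASES)
    return hits_per_base(motif_hits, bases_searched) / random_density


yeast_fold = fold_over_random(260, 1.2e7)    # S. cerevisiae row of Table 2
human_fold = fold_over_random(1055, 9.52e7)  # Homo sapiens row of Table 2
virus_fold = fold_over_random(1077, 3.7e7)   # Viruses row of Table 2
```

Rounded to two decimal places, `yeast_fold` is 5.22 and `human_fold` is 2.67, and `virus_fold` rounds to 7.0, in agreement with the tabulated values.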



Table 2. Summary of search results.

Organism                              # Bases Searched    # Motif Hits    Fold > Random

Random sequence                       2.0 x 10^7          83
Saccharomyces cerevisiae (yeast)      1.2 x 10^7          260             5.22
Homo sapiens (human)                  9.52 x 10^7         1055            2.67
Mus musculus (mouse)                  2.13 x 10^7         320             3.62
Rattus norvegicus (rat)               1.14 x 10^7         103             2.18
Gallus gallus (chicken)               2.37 x 10^6         57              5.8
Sus scrofa (pig)                      1.5 x 10^6          25              4.02
Drosophila melanogaster (fruitfly)    1.16 x 10^7         167             3.47
Viruses                               3.7 x 10^7          1077            7.0

The results from the S. cerevisiae genome should provide the best estimate of the frequency of
motif hits, because 1) it is complete, 2) it is on the same order of magnitude as the random control, 3) it
contains the least amount of duplications, and 4) it was sequenced without reading-frame bias. Analysis
of this dataset revealed 260 motif hits, approximately 5.2-fold more frequent than random. BLAST
analysis revealed that 153 different recognized genes or CDS were represented. Since the yeast genome
is estimated to contain approximately 5900 genes, these data suggest that at least 2.55% of the genes in
the yeast genome contain at least one consensus programmed -1 ribosomal frameshift signal. Further,
since the algorithm limited the size of gap 1 and gap 2 and disallowed slippery sites of TTTTTTT and
AAAAAAA, the data probably represent an underestimate of the fraction of yeast genes containing
motif hits.

Frameshift signals appear to be evolutionarily conserved between homologous genes in different species.

If a subset of cellular genes utilize programmed -1 ribosomal frameshifting, then specific
frameshift signals would be evolutionarily conserved in homologous genes from different organisms. A
preliminary comparison of the locations and structures of motif hits in homologous genes in the different
databases reveals cases where nearly identical motif hits appear to be conserved. Two such examples, a
comparison of Fibrillin 2 in human and mouse, and of the Sulfonylurea Receptor in human and rat, are
shown in Fig. 2. It is notable that whereas the slippery sites and stems of the motifs are highly
conserved, the lengths of gap 3, which are not expected to play a critical role, are variable in both of these
examples. Thus it appears that the biologically important elements of the frameshift signals have been
conserved, while the unimportant elements have been allowed to drift.

Mutations that have been linked to inherited human diseases correlate with those that are
predicted to abolish -1 ribosomal frameshifting.

If programmed -1 ribosomal frameshifting has a biologically relevant function in cellular gene
expression, then there should be a correlation between mutations that disrupt frameshifting by altering
the -1 ribosomal frameshift signal, and human alleles that have been linked to genetically inherited
diseases. This hypothesis predicts that the disease alleles would encode missense mutations, or the
addition or deletion of entire codons. A preliminary analysis of the human motif hit database identified
four alleles of three genes that fit these criteria (Table 3).

Table 3: Three Human Genes Where Specific Mutations in the Consensus -1 Ribosomal
Frameshifting Signals Have Been Linked to Disease.

Description                            Diseases and allelic variants*

ETFA-electron transfer flavoprotein    Type II glutaricaciduria. Note: allelic variant .0004
α-subunit precursor                    (Val270DEL3bp) disrupts the spacing between the
                                       slippery site and the RNA pseudoknot.

Triacylglycerol lipase                 Lipoprotein Lipase Deficiency. Note: allelic
                                       variant .0027 (Arg75Ser) disrupts stem 1 of the
                                       RNA pseudoknot. Familial Chylomicronemia
                                       Syndrome. Note: allelic variant .0021 (Trp86Arg)
                                       disrupts stem 2 of the RNA pseudoknot.

FASL receptor                          Autoimmune lymphoproliferative syndrome. Note:
                                       allelic variant .0007 (Tyr216Cys) disrupts stem 2 in
                                       the RNA pseudoknot.

*The human diseases that are known to be linked to these genes. References to these can be found in the
Online Mendelian Inheritance in Man (OMIM) database on the World Wide Web.

In the human gene encoding triacylglycerol lipase, the .0027 allelic variant (linked to
lipoprotein lipase deficiency) (Wilson et al., 1993), and the .0021 allelic variant (linked
to Familial Chylomicronemia Syndrome) (Gotoda et al., 1992) are both predicted to disrupt the RNA
pseudoknot component of the consensus -1 ribosomal frameshift signal. Similarly, the .0007 allelic
variant of the FASL antigen (linked to autoimmune lymphoproliferative syndrome) (Bettinardi et al.,
1997) is also predicted to disrupt the RNA pseudoknot. Disruption of the mRNA pseudoknot is predicted
to abolish programmed -1 ribosomal frameshifting (reviewed in Dinman, 1995; Jacks, 1996; Farabaugh,
1996; Brierley, 1995; Gesteland and Atkins, 1996; Dinman et al., 1998; TenDam et al., 1990). In
addition, the .0004 allele of the ETFA-electron transfer flavoprotein α-subunit precursor (linked to type
II glutaricaciduria) (Freneaux et al., 1992) disrupts the spacing between the slippery site and the RNA
pseudoknot, which is predicted to result in a decrease in programmed -1 ribosomal frameshifting
efficiency (Dinman and Wickner, 1992; Brierley et al., 1991; Brierley et al., 1992; Morikawa and
Bishop, 1992).

In summary, a computer implemented method has been developed that is capable of detecting
known viral -1 ribosomal frameshift signals. We have demonstrated that these motif hits occur with
frequencies approximately one order of magnitude greater than random in many large DNA sequence
databases, and there are examples where the consensus frameshift signals appear to be evolutionarily
conserved in homologous genes in different organisms. Finally, three examples are shown where single
missense mutations that occur in the frameshift signal correspond with previously identified genetically
inherited diseases in humans.

Computer identified motif hits can promote efficient levels of programmed -1 ribosomal
frameshifting in S. cerevisiae.

Using a series of frameshift reporter plasmids and yeast strains previously developed, a set of
motif hits that were identified by the computer program were tested for the ability to promote efficient
levels of programmed -1 ribosomal frameshifting in intact cells. Plasmids to monitor programmed
ribosomal frameshifting were previously described (Cui et al., 1996; Dinman et al., 1997; Dinman and
Kinzy, 1997; Turner et al., 1998; Cui et al., 1998). Briefly, in all of these plasmids, transcription is driven
from the yeast PGK1 promoter into an AUG translational start site. The E. coli lacZ gene serves as the
reporter, and transcription termination utilizes the yeast PGK1 transcriptional terminator. In the p0
plasmids, lacZ is in the 0-frame with respect to the translational start site, and measurements of
β-galactosidase activity generated from cells transformed with these plasmids serve as the 0-frame
controls. In the p-1 series, the predicted programmed -1 ribosomal frameshift signals have been cloned
into unique Bam HI and Sma I sites in p0. Thus, in the p-1 series of plasmids, lacZ is in the -1 frame with
respect to the translational start site, and is 3' of a predicted programmed -1 ribosomal frameshift signal
such that β-galactosidase can only be produced as a consequence of a programmed -1 ribosomal
frameshift. p0 and p-1 are introduced into yeast cells in parallel, and the amount of the lacZ gene product
(β-galactosidase) present in both sets of cells is determined. Motif hits amplified by PCR from yeast
genomic DNA were cloned into pJD160 in such a way that a programmed -1 ribosomal frameshift is
required for translation of the lacZ gene. This set constitutes the frameshift test plasmids. Programmed
-1 ribosomal frameshift efficiencies were calculated by dividing the β-galactosidase activities generated
from cells harboring frameshift test plasmids by the β-galactosidase activity generated by the 0-frame
control, pJD160.0. As a control to determine the background levels of unprogrammed -1 frameshifting,
β-galactosidase activities generated from cells harboring pJD160.-1 were determined. Further, the
efficiency of programmed -1 ribosomal frameshifting as promoted by the L-A virus frameshift signal was
determined in order to compare the frameshift promoting abilities of the motif hits to a known
programmed -1 ribosomal frameshift signal. The results of these experiments demonstrate that the motif
hits that were tested are all capable of promoting efficient programmed -1 ribosomal frameshifting as
compared to the L-A frameshift signal (Table 4).
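The efficiency calculation described above is a simple ratio of reporter activities expressed as a percentage. The Python sketch below illustrates it. The function name and the activity values are hypothetical and chosen only to show the arithmetic; they are not measurements from this study, and no background subtraction is applied because the text describes none.

```python
def frameshift_efficiency(test_activity, zero_frame_activity):
    """Percent -1 frameshifting from beta-galactosidase activities.

    test_activity: activity from cells carrying a frameshift test plasmid.
    zero_frame_activity: activity from cells carrying the 0-frame control.
    Both values must be in the same units.
    """
    if zero_frame_activity <= 0:
        raise ValueError("0-frame control activity must be positive")
    return 100.0 * test_activity / zero_frame_activity


# Hypothetical activities in arbitrary units, for illustration only.
example_control = 1000.0
example_test = 44.0
example_efficiency = frameshift_efficiency(example_test, example_control)
```

With these illustrative inputs the efficiency is 4.4%. The same function applied to cells harboring pJD160.-1 would give the background level of unprogrammed -1 frameshifting.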

Table 4. Motif hits can promote efficient levels of programmed -1 ribosomal frameshifting in intact
yeast cells.

Frameshift signal      % -1 ribosomal frameshifting

L-A dsRNA virus        1.9%
RAS1                   4.4%

Discussion 

Following the hypothesis that biological systems tend to conserve usable regulatory
mechanisms, a computer program was developed based on an algorithm describing a set of consensus
programmed -1 ribosomal frameshift signals. It has been demonstrated 1) that the program is capable of
finding known frameshift signals, 2) that these motif hits occur in the large DNA databases at frequencies
that are significantly greater than random, 3) that very similar motif hits can be found to be evolutionarily
conserved in homologous genes from different species, 4) that known missense alleles that have been
linked to human diseases are predicted to disrupt frameshift signals, and 5) that at least one motif hit
from the yeast S. cerevisiae genome is capable of promoting efficient levels of programmed -1 ribosomal
frameshifting. These findings indicate that, in addition to viruses, programmed -1 ribosomal
frameshifting is also utilized to regulate the expression of chromosomally encoded genes in eukaryotes.


Possible regulatory roles of programmed -1 ribosomal frameshifting. 

There are three possible translational outcomes of a programmed ribosomal frameshift. A
frameshift could result in the production of an extended fusion protein such as the viral gag-pol protein.
In the context of cellular proteins, there are many imaginable consequences of the addition of a
C-terminal domain. For example, such a domain could provide a means to physically localize the protein
to a different compartment. An additional C-terminal domain could encode an enzymatic or signaling
function, or even provide an autoregulatory function. A programmed ribosomal frameshift could also
result in the production of two proteins having identical N-terminal domains and different C-termini. In
addition to the consequences listed above, such an outcome could also result in a bifurcation function.
For example, the two proteins could have identical input functions (e.g. both can act as a receptor for the
same ligand), but different output functions (e.g. transduction of the signal to different regulatory
pathways). Thus, programmed ribosomal frameshifting could be utilized by cells to effect activity in
different biological regulatory pathways.

A third possible outcome is that programmed ribosomal frameshifting results in a premature
termination event. Such an event may signal to the translational complex that the mRNA being
translated contains a nonsense mutation. mRNAs which contain nonsense mutations are rapidly
degraded via the nonsense-mediated mRNA decay (NMD) pathway (reviewed in Weng et al., 1997).
The rate of mRNA decay plays an important role in the regulation of gene expression, and the decay rate
of an mRNA can be modulated, depending on the cell type, stage of the cell cycle, or environmental
conditions (see Atwater et al., 1990; Cleveland and Yen, 1989; Peltz et al., 1991 for reviews). It has been
shown that aberrant regulation of post-transcriptional control mechanisms can lead to disease (reviewed
in Ross, 1995). Altered stability of certain mRNAs has been suggested to be an important factor in
determining the onset and severity of disease. Examples include the differences in stability between
the wild-type c-myc mRNA and its translocated form found in Burkitt's lymphoma; between the highly
oncogenic v-fos mRNA and the weakly oncogenic c-fos mRNA (reviewed in Weng et al., 1997; Lee et al.,
1988; Raymond et al., 1989); and between mRNAs encoding the oncogenic E6/E7 proteins of the
nonintegrated human papillomaviruses found in benign cervical lesions and the more stable E6/E7 mRNAs
synthesized from the integrated form of the virus that correlates with cervical carcinomas (Jeon and
Lambert, 1995). Further, mutations in trans-acting factors that regulate mRNA turnover may also lead to
aberrant gene regulation and disease. Mutations in trans-acting factors specifically stabilize the
lymphokine GM-CSF mRNA in monocytic tumors compared with non-tumor cells (Schuler and Cole,
1988).

As noted above, both the RAS1 and STE5 programmed ribosomal frameshift signals fall into this
class, causing approximately 5% of translating ribosomes to encounter premature termination signals.
One concern is the biological significance of a mere 5% frameshifting efficiency, in that this would
appear to result in an insignificant 5% change in overall Ras1 protein concentration. However, this does not take
into account the fact that a -1 ribosomal frameshift would lead to the premature translational termination
of that specific mRNA molecule. As such, a frameshift event on a specific mRNA would trigger the
destruction of that mRNA, and thus these frameshift signals should act as mRNA destabilizing elements,
decreasing the overall stability of all of those mRNAs. For example, in the absence of a frameshift
signal, each mRNA might be translated 100 times, resulting in the production of 100 protein molecules
per mRNA. In the presence of the signal, however, a frameshift efficiency of 5% would result in 1 in 20
translating ribosomes encountering a premature termination signal on each individual mRNA, activating
the NMD pathway. Thus, each mRNA would be limited to producing an average of only 19 or 20 protein
molecules, an 80% reduction in the total amount of protein synthesized. Thus we propose that
programmed ribosomal frameshifting may be used by a subset of cellular mRNAs as a general
mechanism to regulate their stability and consequently the abundance of their encoded protein products.
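The arithmetic behind this estimate can be illustrated with a short calculation. The sketch below is a deliberately simplified model and not a method taken from this disclosure: it assumes that each ribosome frameshifts independently with the same probability, that the first frameshift on an mRNA immediately removes it from the translatable pool, and that an undisturbed mRNA is translated a fixed 100 times. The function name and parameters are hypothetical. Under those assumptions the expected yield at 5% frameshifting is close to 19 full-length proteins per mRNA, roughly an 80% reduction.

```python
def expected_proteins(p_shift, max_rounds=100):
    """Expected full-length proteins made from one mRNA.

    Assumed model: round k yields a protein only if none of the first k
    ribosomes frameshifted, which happens with probability (1 - p_shift)**k.
    The expectation is the sum of those probabilities over all rounds.
    """
    if not 0.0 <= p_shift <= 1.0:
        raise ValueError("p_shift must lie between 0 and 1")
    survive = 1.0 - p_shift
    return sum(survive ** k for k in range(1, max_rounds + 1))


if __name__ == "__main__":
    baseline = expected_proteins(0.0)      # no frameshift signal
    with_signal = expected_proteins(0.05)  # 5% frameshift efficiency
    reduction = 1.0 - with_signal / baseline
    print(f"baseline: {baseline:.1f} proteins per mRNA")
    print(f"with signal: {with_signal:.1f} proteins per mRNA")
    print(f"reduction: {reduction:.0%}")
```

In this model the yield is dominated by the factor (1 - p)/p, so small changes in frameshifting efficiency translate into large changes in protein output.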

The abundance of a subset of cellular mRNAs may be biologically regulated by modulation of
programmed -1 ribosomal frameshifting efficiencies. As noted above, the rate of mRNA decay plays an
important role in the regulation of gene expression, and the decay rate of an mRNA can be modulated,
depending on the cell type, stage of the cell cycle, or environmental conditions. Thus, programmed -1
ribosomal frameshifting may be used as a mechanism to regulate the abundance of a subset of cellular
mRNAs. The possibilities for signaling mechanisms that may act to modulate programmed -1 ribosomal
frameshift efficiencies are numerous. These may include cell-cycle, heat shock, developmental,
and other signals.

The recent observation that anisomycin specifically inhibits programmed -1 ribosomal
frameshifting (Dinman et al., 1997) provides a potentially intriguing link between regulation of
programmed ribosomal frameshifting and the control of cell growth and division. There is a considerable
body of literature describing the ability of anisomycin to activate the Jun kinase/stress-activated protein
kinase (JNK/SAPK) pathway (reviewed in Shu et al., 1996; Moxham et al., 1996). Anisomycin
stimulates expression of the c-jun, c-fos and c-myc proto-oncogenes (Yu et al., 1996; Moxham et al.,
1996; Kawasaki et al., 1996; Hazzalin et al., 1996), activates the MAP-kinases (Moxham et al., 1996;
Hazzalin et al., 1996; Nahas et al., 1996; Cano et al., 1996), pre-ribosomal S6, histone H3 and HMG-14
(Hazzalin et al., 1996), ELAM-1 (Gersa et al., 1992), angiotensin II (Yu et al., 1996), the Ras-dependent
and Ras-independent pathways (Kawasaki et al., 1996), p38/RK (yeast Hog1p) (Nahas et al., 1996; Cano
et al., 1996), MEK6 (Stein et al., 1996), and insulin-like growth factor II (Nielsen et al., 1995). The
effects of anisomycin are specific: other protein synthesis inhibitors (e.g., cycloheximide or emetine)
block cell cycle progression without strong JNK/SAPK induction (Shu et al., 1996).

Anisomycin inhibits protein translation at the level of elongation. It has been proposed that
inhibition of protein synthesis leads to a decrease in the levels of labile negative growth regulating
proteins, thus promoting cell growth and division (Gersa et al., 1992; Smailov et al., 1993; Rosenwald et
al., 1995; Abdelmajid et al., 1993). According to this hypothesis, however, any general inhibitor of
translation should result in this effect, and thus the JNK/SAPK pathway should be nonspecifically
induced by any inhibitor of protein synthesis. This is not the case since 1) not all translational inhibitors
stimulate this pathway, and 2) pathway-specific induction is observed. Since anisomycin decreases
programmed -1 ribosomal frameshifting efficiencies, it is believed that the regulation of
expression of proteins involved in the JNK/SAPK signaling pathway occurs at the post-transcriptional
level by regulating efficiencies of ribosomal frameshifting rather than by generally inhibiting protein
synthesis. This model retains the suggestion that a labile element is linked to specific inhibitors of
protein synthesis, but proposes that the element is mRNA rather than protein. Thus, anisomycin likely
causes an increase in the abundance of these labile cellular mRNAs, which encode positive growth
regulators, by decreasing programmed ribosomal frameshifting efficiencies. In normal growth these mRNAs
would direct ribosomes to shift reading frame into early termination codons, making these mRNAs
substrates for the nonsense-mediated mRNA decay pathway. These mRNAs would normally be non-abundant
species with short half-lives and low production of their encoded protein products. However, under certain
conditions, they could be stabilized as a consequence of decreased efficiencies of ribosomal
frameshifting. Stabilization of these mRNAs would upregulate the expression of their encoded products,
which presumably are positive regulators of cell growth and division. The ability to specifically regulate
the half-lives, and thereby the abundance, of mRNAs containing -1 ribosomal frameshift signals provides
the cell with a level of specificity that the labile negative growth regulating protein model cannot account
for.

Several lines of evidence are consistent with this model. First, anisomycin should stabilize
nonsense-containing mRNAs. It has been demonstrated that anisomycin acts post-transcriptionally by
stabilizing the ELAM-1 mRNA and other nonsense-containing mRNAs (Gersa et al., 1992; Li et al., 1996),
and that anisomycin regulates the expression of prepro-IGF-II in a post-transcriptional manner (Nielsen
et al., 1995). Second, if anisomycin induces cell proliferation by decreasing -1 ribosomal frameshifting
efficiencies in a specific set of mRNAs, then sparsomycin should have anti-proliferative effects by virtue
of its ability to increase -1 ribosomal frameshifting efficiencies (see Dinman et al., 1997). Sparsomycin
analogs have been demonstrated to have antitumor activities (Hofs et al., 1995a; Hofs et al., 1995b; Hofs
et al., 1994). Third, three of the well-characterized examples of non-viral programmed ribosomal
frameshifting all involve autoregulatory feedback mechanisms in which levels of the encoded protein
products affect the efficiencies of ribosomal frameshifting along their own mRNAs (reviewed in Gesteland
and Atkins, 1996). These examples where ribosomal frameshifting efficiency is autoregulated provide further
support for the hypothesis that programmed ribosomal frameshifting can be used to regulate the
abundance and expression of cellular mRNAs and their encoded products.

All of the publications cited herein or listed below are cited for background purposes and the
disclosure of such publications is not essential for an understanding of the invention. All of the
publications are hereby incorporated by reference.
What is claimed is:

1. A method of identifying a nucleic acid sequence involved in ribosomal frameshifting comprising:
1) searching a database of gene sequences to identify sequences which contain the sequence
XXX YYY Z, wherein
XXX represents GGG, AAA, TTT or CCC,
YYY represents AAA or TTT,
Z represents A, T, or C
and wherein XXXYYYZ is not AAAAAAA or TTTTTTT;
2) further searching among those sequences identified in step 1 for a sequence encoding a
pseudoknot structure which is within eight nucleotides of the sequence identified in step 1.

2. The method of claim 1, wherein XXXYYYZ represents a sequence selected from the group
consisting of GGG AAA A, GGG AAA T, GGG AAA C, AAA AAA T, AAA AAA C, TTT AAA A,
TTT AAA T, TTT AAA C, CCC AAA A, CCC AAA T, CCC AAA C, GGG TTT A, GGG TTT T,
GGG TTT C, AAA TTT A, AAA TTT T, AAA TTT C, TTT TTT A, TTT TTT C, CCC TTT A, CCC
TTT T, and CCC TTT C.

3. A method of identifying a nucleic acid sequence involved in ribosomal frameshifting comprising
the steps of:

selecting a gene sequence having a sequence of nucleotides from the group of GGG, AAA, TTT
and CCC;

selecting said gene sequence having an adjacent sequence of nucleotides from the group of AAA
and TTT;

selecting said gene sequence having a nucleotide from the group of A, T and C, said nucleotide
adjacent to said adjacent sequence of nucleotides;

excluding said gene sequence wherein said sequence of nucleotides is AAA, said adjacent
sequence of nucleotides is AAA and said nucleotide is A;

excluding said gene sequence wherein said sequence of nucleotides is TTT, said adjacent
sequence of nucleotides is TTT and said nucleotide is T;

searching for an encoded pseudoknot structure which starts within eight nucleotides of said
selected gene sequence.

4. The method of claim 3 wherein XXXYYYZ represents a sequence selected from the group
consisting of GGG AAA A, GGG AAA T, GGG AAA C, AAA AAA T, AAA AAA C, TTT AAA A,
TTT AAA T, TTT AAA C, CCC AAA A, CCC AAA T, CCC AAA C, GGG TTT A, GGG TTT T,
GGG TTT C, AAA TTT A, AAA TTT T, AAA TTT C, TTT TTT A, TTT TTT C, CCC TTT A, CCC
TTT T, and CCC TTT C.

5. A system for identifying a nucleic acid sequence involved in ribosomal frameshifting, the system
comprising:

access means for accessing a database of gene sequences;

selection means for selecting a particular gene sequence from said database of gene sequences,
said particular gene sequence having a sequence of nucleotides from the group of GGG, AAA, TTT and
CCC, an adjacent sequence of nucleotides from the group of AAA and TTT, a nucleotide from the group
of A, T and C, said nucleotide adjacent to said adjacent sequence of nucleotides, wherein said particular
gene sequence is excluded from selection when said sequence of nucleotides is AAA, said adjacent
sequence of nucleotides is AAA and said nucleotide is A and said particular gene sequence is excluded
from selection when said sequence of nucleotides is TTT, said adjacent sequence of nucleotides is TTT
and said nucleotide is T;

pseudoknot search means for locating an encoded pseudoknot structure which starts within eight
nucleotides of said selected gene sequence.

6. The system as recited in claim 5 wherein XXXYYYZ represents a sequence selected from the
group consisting of GGG AAA A, GGG AAA T, GGG AAA C, AAA AAA T, AAA AAA C, TTT AAA
A, TTT AAA T, TTT AAA C, CCC AAA A, CCC AAA T, CCC AAA C, GGG TTT A, GGG TTT T,
GGG TTT C, AAA TTT A, AAA TTT T, AAA TTT C, TTT TTT A, TTT TTT C, CCC TTT A, CCC
TTT T, and CCC TTT C.

7. A method of regulating expression of a mammalian gene comprising modulating the frequency of
ribosomal frameshifting during translation of messenger RNA.

8. The method according to claim 7, wherein the frequency of frameshifting is increased.

9. The method according to claim 7, wherein the frequency of frameshifting is decreased.

10. The method according to claim 7, wherein the gene encodes an oncogene.

11. The method according to claim 7, wherein the gene encodes a tumor suppressor gene.

12. The method according to claim 7, wherein the gene encodes a hormone.

13. The method according to claim 7, wherein the gene encodes a human growth hormone.

14. The method according to claim 7, wherein the gene encodes a hormone receptor.

15. The method according to claim 7, wherein the gene encodes a human growth hormone receptor.

16. The method according to claim 6, wherein the gene encodes a catalytic enzyme.

17. A method of treating a disease caused by reduced expression of a gene product which is 
produced as a result of ribosomal frameshifting, comprising increasing the frequency of ribosomal 
frameshifting during translation of the gene. 

18. A method of treating a disease caused by increased expression of a gene product which is
produced as a result of ribosomal frameshifting, comprising decreasing the frequency of ribosomal 
frameshifting during translation of the gene. 
SEQUENCE LISTING



<110> Dinman, Jonathan D. 
Peltz, Stuart W. 

<120> RIBOSOMAL FRAMESHIFT TARGETS

<130> UMDNJ-31060 

<140> 
<141> 

<150> 60/100285 
<151> 1998-09-14

<160> 2 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 35 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: primer 
<400> 1 

aaagaattcc gacatgcagg gaaatccaaa tcaac 

<210> 2 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer 
<400> 2 

ccccggtacc gtcatcgatg acaactt 